Overview

Dataset statistics

Number of variables: 11
Number of observations: 1312
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 112.9 KiB
Average record size in memory: 88.1 B

Variable types

Numeric: 11

Warnings

schooldist is highly correlated with council and 1 other field (High correlation)
council is highly correlated with schooldist and 1 other field (High correlation)
zipcode is highly correlated with schooldist and 1 other field (High correlation)
lotarea is highly correlated with bldgarea and 1 other field (High correlation)
bldgarea is highly correlated with lotarea and 1 other field (High correlation)
unitstotal is highly correlated with lotarea and 1 other field (High correlation)
block is highly correlated with council (High correlation)
council is highly correlated with block and 2 other fields (High correlation)
landuse is highly correlated with unitstotal (High correlation)
lotarea is highly correlated with bldgarea (High correlation)
bldgarea is highly correlated with lotarea (High correlation)
unitstotal is highly correlated with landuse (High correlation)
landuse is highly correlated with yearbuilt and 1 other field (High correlation)
zipcode is highly correlated with schooldist and 2 other fields (High correlation)
lotarea is highly correlated with unitstotal and 1 other field (High correlation)
yearbuilt is highly correlated with landuse (High correlation)
schooldist is highly correlated with zipcode and 2 other fields (High correlation)
numfloors is highly correlated with bldgarea (High correlation)
council is highly correlated with zipcode and 3 other fields (High correlation)
block is highly correlated with zipcode and 2 other fields (High correlation)
bldgarea is highly correlated with lotarea and 2 other fields (High correlation)
df_index is highly correlated with landuse and 1 other field (High correlation)
lotarea is highly skewed (γ1 = 26.28937583) (Skewed)
df_index has unique values (Unique)
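The γ1 in the skewness warning is the third standardized moment of the variable. A minimal sketch of that calculation follows, using the plain population formula; pandas-profiling may use a bias-corrected estimator, so its figures can differ slightly. The sample values are hypothetical and are not taken from the dataset.

```python
def skewness(values):
    """Population skewness: third central moment divided by variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical lot areas: mostly small lots plus one very large outlier.
lot_areas = [5000, 7500, 9000, 12000, 20000, 5000000]
symmetric = [1, 2, 3, 4, 5]

print(skewness(lot_areas))  # positive: long right tail
print(skewness(symmetric))  # zero for a symmetric sample
```

A single extreme value, such as the lotarea maximum of 5048550 against a median of 21579, is enough to push γ1 far above zero.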

Reproduction

Analysis started: 2021-06-05 03:48:58.937824
Analysis finished: 2021-06-05 03:49:15.189469
Duration: 16.25 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

df_index
Real number (ℝ≥0)

HIGH CORRELATION
UNIQUE

Distinct: 1312
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 415825.7248
Minimum: 7
Maximum: 858669
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.4 KiB

Quantile statistics

Minimum: 7
5-th percentile: 16462.3
Q1: 164294.5
median: 270229
Q3: 681278
95-th percentile: 797188.8
Maximum: 858669
Range: 858662
Interquartile range (IQR): 516983.5

Descriptive statistics

Standard deviation: 281477.3367
Coefficient of variation (CV): 0.6769117923
Kurtosis: -1.64255652
Mean: 415825.7248
Median Absolute Deviation (MAD): 261976.5
Skewness: 0.08677248126
Sum: 545563351
Variance: 7.922949106 × 10^10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
75777                 1       0.1%
642442                1       0.1%
200442                1       0.1%
241029                1       0.1%
269519                1       0.1%
241027                1       0.1%
268006                1       0.1%
603518                1       0.1%
71037                 1       0.1%
603516                1       0.1%
Other values (1302)   1302    99.2%

Value                 Count   Frequency (%)
7                     1       0.1%
21                    1       0.1%
32                    1       0.1%
33                    1       0.1%
394                   1       0.1%
403                   1       0.1%
424                   1       0.1%
439                   1       0.1%
491                   1       0.1%
857                   1       0.1%

Value                 Count   Frequency (%)
858669                1       0.1%
855445                1       0.1%
853648                1       0.1%
853646                1       0.1%
845489                1       0.1%
845434                1       0.1%
845270                1       0.1%
842007                1       0.1%
841008                1       0.1%
840979                1       0.1%
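
Several of the df_index figures above are derived from the others, so they can be cross-checked by hand. The sketch below recomputes the range, IQR, coefficient of variation, variance and mean from the reported values; the variable names are illustrative only.

```python
# Values copied from the df_index statistics in this report.
n_observations = 1312
total = 545563351
minimum, maximum = 7, 858669
q1, q3 = 164294.5, 681278
mean, std = 415825.7248, 281477.3367

value_range = maximum - minimum         # Range
iqr = q3 - q1                           # Interquartile range (IQR)
cv = std / mean                         # Coefficient of variation (CV)
variance = std ** 2                     # Variance
mean_from_sum = total / n_observations  # Mean recomputed from Sum

print(value_range, iqr)
print(round(cv, 6), f"{variance:.6e}", round(mean_from_sum, 4))
```

The recomputed values agree with the reported Range (858662), IQR (516983.5), CV, Variance and Mean to within rounding.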

block
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 740
Distinct (%): 56.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1191.785061
Minimum: 4
Maximum: 15638
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.4 KiB

Quantile statistics

Minimum: 4
5-th percentile: 28
Q1: 778.75
median: 1104.5
Q3: 1350.25
95-th percentile: 2427.05
Maximum: 15638
Range: 15634
Interquartile range (IQR): 571.5

Descriptive statistics

Standard deviation: 1181.705809
Coefficient of variation (CV): 0.9915427267
Kurtosis: 45.08573256
Mean: 1191.785061
Median Absolute Deviation (MAD): 295
Skewness: 5.255268283
Sum: 1563622
Variance: 1396428.619
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
16                    20      1.5%
1171                  10      0.8%
1118                  9       0.7%
21                    8       0.6%
1275                  8       0.6%
763                   8       0.6%
1269                  7       0.5%
1158                  7       0.5%
1374                  6       0.5%
760                   6       0.5%
Other values (730)    1223    93.2%

Value                 Count   Frequency (%)
4                     1       0.1%
5                     1       0.1%
6                     4       0.3%
9                     2       0.2%
10                    1       0.1%
11                    2       0.2%
13                    1       0.1%
15                    2       0.2%
16                    20      1.5%
17                    3       0.2%

Value                 Count   Frequency (%)
15638                 1       0.1%
15610                 1       0.1%
10101                 1       0.1%
9998                  1       0.1%
7459                  1       0.1%
7279                  1       0.1%
7274                  5       0.4%
7273                  2       0.2%
7253                  1       0.1%
7250                  1       0.1%

schooldist
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 27
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.256097561
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 2
median: 2
Q3: 2
95-th percentile: 18.35
Maximum: 31
Range: 30
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 6.058320124
Coefficient of variation (CV): 1.423444843
Kurtosis: 8.845706856
Mean: 4.256097561
Median Absolute Deviation (MAD): 0
Skewness: 3.062523127
Sum: 5584
Variance: 36.70324273
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=27)
Value                 Count   Frequency (%)
2                     983     74.9%
3                     100     7.6%
13                    39      3.0%
30                    31      2.4%
14                    21      1.6%
1                     19      1.4%
5                     17      1.3%
21                    13      1.0%
15                    12      0.9%
7                     10      0.8%
Other values (17)     67      5.1%

Value                 Count   Frequency (%)
1                     19      1.4%
2                     983     74.9%
3                     100     7.6%
4                     9       0.7%
5                     17      1.3%
6                     10      0.8%
7                     10      0.8%
8                     2       0.2%
9                     5       0.4%
10                    7       0.5%

Value                 Count   Frequency (%)
31                    1       0.1%
30                    31      2.4%
28                    10      0.8%
27                    2       0.2%
25                    3       0.2%
24                    1       0.1%
23                    3       0.2%
22                    1       0.1%
21                    13      1.0%
20                    1       0.1%

council
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 37
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.041158537
Minimum: 1
Maximum: 50
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
median: 4
Q3: 5
95-th percentile: 33
Maximum: 50
Range: 49
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 9.526060343
Coefficient of variation (CV): 1.35291093
Kurtosis: 5.489807161
Mean: 7.041158537
Median Absolute Deviation (MAD): 1
Skewness: 2.539203771
Sum: 9238
Variance: 90.74582566
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=37)
Value                 Count   Frequency (%)
4                     438     33.4%
3                     212     16.2%
1                     178     13.6%
5                     110     8.4%
6                     86      6.6%
2                     72      5.5%
33                    54      4.1%
26                    31      2.4%
35                    17      1.3%
7                     14      1.1%
Other values (27)     100     7.6%

Value                 Count   Frequency (%)
1                     178     13.6%
2                     72      5.5%
3                     212     16.2%
4                     438     33.4%
5                     110     8.4%
6                     86      6.6%
7                     14      1.1%
8                     11      0.8%
9                     12      0.9%
10                    9       0.7%

Value                 Count   Frequency (%)
50                    1       0.1%
48                    9       0.7%
47                    4       0.3%
43                    1       0.1%
42                    1       0.1%
41                    1       0.1%
40                    1       0.1%
38                    1       0.1%
37                    1       0.1%
36                    1       0.1%

zipcode
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 97
Distinct (%): 7.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10168.26982
Minimum: 10001
Maximum: 11691
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.4 KiB

Quantile statistics

Minimum: 10001
5-th percentile: 10001
Q1: 10016
median: 10022
Q3: 10038
95-th percentile: 11212
Maximum: 11691
Range: 1690
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 372.666339
Coefficient of variation (CV): 0.03664992626
Kurtosis: 4.111094274
Mean: 10168.26982
Median Absolute Deviation (MAD): 11
Skewness: 2.405819607
Sum: 13340770
Variance: 138880.2002
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
10022                 112     8.5%
10017                 96      7.3%
10019                 90      6.9%
10016                 83      6.3%
10036                 71      5.4%
10018                 70      5.3%
10001                 68      5.2%
10023                 61      4.6%
10128                 40      3.0%
11201                 37      2.8%
Other values (87)     584     44.5%

Value                 Count   Frequency (%)
10001                 68      5.2%
10002                 14      1.1%
10003                 15      1.1%
10004                 23      1.8%
10005                 28      2.1%
10006                 20      1.5%
10007                 26      2.0%
10009                 2       0.2%
10010                 29      2.2%
10011                 16      1.2%

Value                 Count   Frequency (%)
11691                 2       0.2%
11435                 1       0.1%
11433                 1       0.1%
11415                 1       0.1%
11379                 1       0.1%
11375                 5       0.4%
11374                 2       0.2%
11365                 1       0.1%
11355                 2       0.2%
11249                 10      0.8%

landuse
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 7
Distinct (%): 0.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.22027439
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 4
median: 4
Q3: 5
95-th percentile: 5
Maximum: 8
Range: 7
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.9322351204
Coefficient of variation (CV): 0.2208944334
Kurtosis: 2.976752232
Mean: 4.22027439
Median Absolute Deviation (MAD): 1
Skewness: 0.8845470006
Sum: 5537
Variance: 0.8690623198
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)
Value                 Count   Frequency (%)
5                     499     38.0%
4                     483     36.8%
3                     303     23.1%
8                     24      1.8%
6                     1       0.1%
2                     1       0.1%
1                     1       0.1%

Value                 Count   Frequency (%)
1                     1       0.1%
2                     1       0.1%
3                     303     23.1%
4                     483     36.8%
5                     499     38.0%
6                     1       0.1%
8                     24      1.8%

Value                 Count   Frequency (%)
8                     24      1.8%
6                     1       0.1%
5                     499     38.0%
4                     483     36.8%
3                     303     23.1%
2                     1       0.1%
1                     1       0.1%

lotarea
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 1198
Distinct (%): 91.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 44862.3407
Minimum: 1506
Maximum: 5048550
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.4 KiB

Quantile statistics

Minimum: 1506
5-th percentile: 4938
Q1: 11146.75
median: 21579
Q3: 41603.5
95-th percentile: 139416.3
Maximum: 5048550
Range: 5047044
Interquartile range (IQR): 30456.75

Descriptive statistics

Standard deviation: 154873.9803
Coefficient of variation (CV): 3.452204629
Kurtosis: 833.9838104
Mean: 44862.3407
Median Absolute Deviation (MAD): 12501.5
Skewness: 26.28937583
Sum: 58859391
Variance: 2.398594976 × 10^10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
7406                  8       0.6%
9875                  7       0.5%
10042                 6       0.5%
6025                  6       0.5%
7500                  5       0.4%
6024                  4       0.3%
5021                  4       0.3%
7531                  4       0.3%
24100                 4       0.3%
12552                 4       0.3%
Other values (1188)   1260    96.0%

Value                 Count   Frequency (%)
1506                  2       0.2%
1942                  1       0.1%
2025                  1       0.1%
2143                  1       0.1%
2150                  1       0.1%
2209                  2       0.2%
2468                  1       0.1%
2469                  1       0.1%
2475                  1       0.1%
2510                  3       0.2%

Value                 Count   Frequency (%)
5048550               1       0.1%
856800                1       0.1%
833945                1       0.1%
746956                1       0.1%
659375                1       0.1%
622700                1       0.1%
539730                1       0.1%
519220                1       0.1%
393100                1       0.1%
375650                1       0.1%

bldgarea
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 1295
Distinct (%): 98.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 470078.9345
Minimum: 1344
Maximum: 13540113
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.4 KiB

Quantile statistics

Minimum: 1344
5-th percentile: 75955.9
Q1: 185485.25
median: 325086.5
Q3: 547810
95-th percentile: 1354649.5
Maximum: 13540113
Range: 13538769
Interquartile range (IQR): 362324.75

Descriptive statistics

Standard deviation: 612529.6512
Coefficient of variation (CV): 1.303035738
Kurtosis: 185.4988725
Mean: 470078.9345
Median Absolute Deviation (MAD): 164582
Skewness: 10.25882396
Sum: 616743562
Variance: 3.751925736 × 10^11
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
332608                3       0.2%
623806                3       0.2%
74291                 2       0.2%
370990                2       0.2%
50648                 2       0.2%
177000                2       0.2%
431000                2       0.2%
470000                2       0.2%
216247                2       0.2%
96420                 2       0.2%
Other values (1285)   1290    98.3%

Value                 Count   Frequency (%)
1344                  1       0.1%
3146                  1       0.1%
3280                  1       0.1%
24212                 1       0.1%
28884                 1       0.1%
35219                 1       0.1%
35670                 1       0.1%
38353                 2       0.2%
39291                 1       0.1%
39964                 1       0.1%

Value                 Count   Frequency (%)
13540113              1       0.1%
8837500               1       0.1%
3693539               1       0.1%
3221237               1       0.1%
2907315               1       0.1%
2812739               1       0.1%
2734038               1       0.1%
2689635               1       0.1%
2636182               1       0.1%
2531670               1       0.1%

numfloors
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 65
Distinct (%): 5.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.55678354
Minimum: 20.5
Maximum: 104
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 10.4 KiB

Quantile statistics

Minimum: 20.5
5-th percentile: 21
Q1: 24
median: 30
Q3: 38
95-th percentile: 55.45
Maximum: 104
Range: 83.5
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 11.67360651
Coefficient of variation (CV): 0.3585614192
Kurtosis: 4.113977272
Mean: 32.55678354
Median Absolute Deviation (MAD): 7
Skewness: 1.694656403
Sum: 42714.5
Variance: 136.273089
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
21                    152     11.6%
22                    90      6.9%
23                    74      5.6%
24                    69      5.3%
26                    69      5.3%
25                    63      4.8%
30                    55      4.2%
32                    53      4.0%
27                    48      3.7%
31                    48      3.7%
Other values (55)     591     45.0%

Value                 Count   Frequency (%)
20.5                  2       0.2%
21                    152     11.6%
22                    90      6.9%
22.5                  1       0.1%
23                    74      5.6%
23.5                  1       0.1%
24                    69      5.3%
25                    63      4.8%
26                    69      5.3%
27                    48      3.7%

Value                 Count   Frequency (%)
104                   1       0.1%
102                   1       0.1%
90                    1       0.1%
88                    2       0.2%
82                    1       0.1%
78                    1       0.1%
77                    1       0.1%
76                    1       0.1%
73                    6       0.5%
72                    1       0.1%

unitstotal
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 499
Distinct (%): 38.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 221.2073171
Minimum: 0
Maximum: 10948
Zeros: 6
Zeros (%): 0.5%
Negative: 0
Negative (%): 0.0%
Memory size: 10.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 36
median: 142
Q3: 306
95-th percentile: 705.9
Maximum: 10948
Range: 10948
Interquartile range (IQR): 270

Descriptive statistics

Standard deviation: 385.4302059
Coefficient of variation (CV): 1.742393566
Kurtosis: 458.2795667
Mean: 221.2073171
Median Absolute Deviation (MAD): 119
Skewness: 17.01352631
Sum: 290224
Variance: 148556.4436
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
1                     107     8.2%
2                     48      3.7%
3                     20      1.5%
184                   12      0.9%
40                    10      0.8%
5                     9       0.7%
4                     9       0.7%
35                    9       0.7%
79                    8       0.6%
7                     8       0.6%
Other values (489)    1072    81.7%

Value                 Count   Frequency (%)
0                     6       0.5%
1                     107     8.2%
2                     48      3.7%
3                     20      1.5%
4                     9       0.7%
5                     9       0.7%
6                     5       0.4%
7                     8       0.6%
8                     5       0.4%
9                     2       0.2%

Value                 Count   Frequency (%)
10948                 1       0.1%
1706                  1       0.1%
1660                  1       0.1%
1615                  1       0.1%
1604                  1       0.1%
1547                  1       0.1%
1521                  1       0.1%
1349                  1       0.1%
1332                  1       0.1%
1321                  1       0.1%

yearbuilt
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 116
Distinct (%): 8.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1973.112043
Minimum: 0
Maximum: 2020
Zeros: 3
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 10.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1922
Q1: 1962
median: 1981
Q3: 2006
95-th percentile: 2018
Maximum: 2020
Range: 2020
Interquartile range (IQR): 44

Descriptive statistics

Standard deviation: 99.53538339
Coefficient of variation (CV): 0.0504458851
Kurtosis: 351.9940468
Mean: 1973.112043
Median Absolute Deviation (MAD): 23
Skewness: -17.85574525
Sum: 2588723
Variance: 9907.292547
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
2015                  42      3.2%
1963                  38      2.9%
1964                  37      2.8%
1987                  35      2.7%
2006                  32      2.4%
1929                  30      2.3%
2018                  30      2.3%
1986                  29      2.2%
2007                  28      2.1%
2019                  28      2.1%
Other values (106)    983     74.9%

Value                 Count   Frequency (%)
0                     3       0.2%
1883                  1       0.1%
1895                  1       0.1%
1896                  1       0.1%
1899                  1       0.1%
1900                  3       0.2%
1901                  1       0.1%
1902                  1       0.1%
1903                  1       0.1%
1904                  2       0.2%

Value                 Count   Frequency (%)
2020                  24      1.8%
2019                  28      2.1%
2018                  30      2.3%
2017                  24      1.8%
2016                  24      1.8%
2015                  42      3.2%
2014                  22      1.7%
2013                  22      1.7%
2012                  22      1.7%
2011                  8       0.6%

Interactions

[Figures not reproduced: 121 pairwise interaction plots, one for each ordered pair of the 11 variables.]

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
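
The calculation described above can be sketched in a few lines of plain Python. This is an illustration of the formula, not the code pandas-profiling runs; the function name and the sample data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)


# Hypothetical data: building area is an exact linear function of lot area.
lot = [1000, 2000, 3000, 4000]
bldg = [2 * v + 500 for v in lot]
noisy = [1, 3, 2, 5]

print(pearson_r(lot, bldg))                         # 1.0 up to floating-point rounding
print(pearson_r(lot, noisy))                        # between 0 and 1
print(pearson_r([3 * v + 7 for v in lot], noisy))   # unchanged by shifting and rescaling
```

The last two calls give the same result up to rounding, which is the location and scale invariance mentioned above (for a positive scale factor).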

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
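
As a rough sketch of that recipe, the example below ranks each variable (tied values share the average of their positions) and then applies the covariance-over-standard-deviations formula to the ranks. The helper names and the data are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but not linear

print(spearman_rho(x, y))  # 1.0 up to floating-point rounding
print(pearson_r(x, y))     # below 1, because the relationship is not linear
```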

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
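
The pair-counting definition can be sketched directly. Note the assumption: the version below is the simple form described in the text (often called tau-a), with tied pairs counted as neither concordant nor discordant. Statistical libraries commonly report a tie-corrected variant instead, so results can differ when the data contain ties. The function name and data are hypothetical.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:      # both variables move in the same direction
            concordant += 1
        elif product < 0:    # the variables move in opposite directions
            discordant += 1
    total_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / total_pairs


print(kendall_tau_a([1, 2, 3, 4], [1, 2, 3, 4]))  # 1.0
print(kendall_tau_a([1, 2, 3, 4], [4, 3, 2, 1]))  # -1.0
print(kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4]))  # 5 concordant, 1 discordant out of 6 pairs
```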

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.

Sample

First rows

      df_index  block  schooldist  council  zipcode  landuse   lotarea   bldgarea  numfloors  unitstotal  yearbuilt
0       201606   1307         2.0      4.0  10022.0      5.0   11050.0   268106.0       35.0        45.0     1983.0
1       156742   1308         2.0      4.0  10022.0      5.0   81325.0  1526121.0       39.0         2.0     1969.0
2       282941   3251        10.0     11.0  10468.0      3.0   89622.0   381213.0       21.0       352.0     1967.0
3       243873    735         2.0      3.0  10018.0      3.0   34167.0   440709.0       23.0       399.0     2008.0
4       241399   1576         2.0      5.0  10075.0      3.0   18900.0   232400.0       30.0       163.0     1981.0
5       128222   1505         2.0      4.0  10128.0      4.0   22102.0   302439.0       32.0       212.0     1984.0
6       241148   1142         3.0      6.0  10023.0      4.0   12778.0   149314.0       21.0       125.0     1989.0
7       746216  15638        27.0     31.0  11691.0      3.0  263791.0   837935.0       26.0       606.0     1971.0
8       603418   1011         2.0      4.0  10019.0      3.0   14250.0   307549.0       36.0       198.0     1940.0
9       200463    997         2.0      4.0  10036.0      5.0   15565.0   426056.0       40.0        66.0     1988.0

Last rows

      df_index  block  schooldist  council  zipcode  landuse   lotarea   bldgarea  numfloors  unitstotal  yearbuilt
1302    642313   1485         2.0      5.0  10021.0      8.0   39547.0   757439.0       24.0         1.0     2015.0
1303    200583   1024         2.0      3.0  10019.0      5.0   23900.0   762619.0       35.0         1.0     1987.0
1304    153741   1314         2.0      4.0  10016.0      8.0   19701.0   279254.0       25.0        24.0     2001.0
1305    608074   2623         7.0     17.0  10455.0      3.0  166139.0   422400.0       22.0       471.0     1960.0
1306    746585    967         2.0      4.0  10016.0      3.0   45190.0   922828.0       47.0       764.0     2014.0
1307    268057    840         2.0      4.0  10018.0      5.0    4148.0    88551.0       34.0       173.0     2018.0
1308    274079   2170         6.0     10.0  10040.0      3.0   96675.0   223200.0       21.0       205.0     1959.0
1309    680761    861         2.0      4.0  10016.0      4.0    8400.0   175687.0       35.0       166.0     2008.0
1310    200392    811         2.0      3.0  10018.0      5.0   19750.0   408511.0       22.0        88.0     1925.0
1311    227378   1037         2.0      3.0  10036.0      5.0    3292.0    75902.0       29.0         1.0     2014.0